Quality Assessment of Crowdsourcing Transcriptions for African Languages

Authors

  • Hadrien Gelas
  • Solomon Teferra Abate
  • Laurent Besacier
  • François Pellegrino
Abstract

We evaluate the quality of speech transcriptions acquired through crowdsourcing for developing ASR acoustic models (AMs) for under-resourced languages. We built AMs from reference (REF) transcriptions and from crowdsourced (TRK) transcriptions for Swahili and Amharic. Although the Amharic transcription task took much longer to complete than the Swahili one, the recognition systems built from REF and TRK transcriptions achieve nearly identical word error rates (40.1% vs. 39.6% for Amharic and 38.0% vs. 38.5% for Swahili). Moreover, the character-level disagreement rates between REF and TRK are only 3.3% for Amharic and 6.1% for Swahili. We conclude that quality transcriptions for under-resourced languages can be acquired from the crowd using Amazon’s Mechanical Turk. Given this potential, we also point out legal and ethical issues to consider.
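The character-level disagreement rate mentioned above can be illustrated with a minimal sketch, assuming (as a hypothetical reading, since the abstract does not define the metric) that it is the character-level edit distance between the two transcriptions, normalized by the reference length:

```python
def levenshtein(a, b):
    # Classic dynamic-programming edit distance between two sequences,
    # keeping only the previous row to stay O(min memory).
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]

def char_disagreement_rate(ref, hyp):
    # Compare at the character level, ignoring word boundaries;
    # normalize by the length of the reference transcription.
    ref_chars = ref.replace(" ", "")
    hyp_chars = hyp.replace(" ", "")
    return levenshtein(ref_chars, hyp_chars) / len(ref_chars)
```

For example, two transcriptions differing in one character out of ten would yield a rate of 0.10; the paper's reported 3.3% and 6.1% figures correspond to this order of disagreement.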


Similar resources

Automatic Speech Recognition Using Probabilistic Transcriptions in Swahili, Amharic, and Dinka

In this study, we develop automatic speech recognition systems for three sub-Saharan African languages using probabilistic transcriptions collected from crowd workers who neither speak nor have any familiarity with the African languages. The three African languages in consideration are Swahili, Amharic, and Dinka. There is a language mismatch in this scenario. More specifically, utterances spok...


Clustering-based Phonetic Projection in Mismatched Crowdsourcing Channels for Low-resourced ASR

Acquiring labeled speech for low-resource languages is a difficult task in the absence of native speakers of the language. One solution to this problem involves collecting speech transcriptions from crowd workers who are foreign or non-native speakers of a given target language. From these mismatched transcriptions, one can derive probabilistic phone transcriptions that are defined over the set...


Best Practices for Crowdsourcing Dialectal Arabic Speech Transcription

In this paper, we investigate different approaches in crowdsourcing transcriptions of Dialectal Arabic speech with automatic quality control to ensure good transcription at the source. Since Dialectal Arabic has no standard orthographic representation, it is very challenging to perform quality control. We propose a complete recipe for speech transcription quality control that includes using out...


A case study on using speech-to-translation alignments for language documentation

For many low-resource or endangered languages, spoken language resources are more likely to be annotated with translations than with transcriptions. Recent work exploits such annotations to produce speech-to-translation alignments, without access to any text transcriptions. We investigate whether providing such information can aid in producing better (mismatched) crowdsourced transcriptions, wh...


Don't Count on ASR to Transcribe for You: Breaking Bias with Two Crowds

A crowdsourcing approach for collecting high-quality speech transcriptions is presented. The approach addresses a typical weakness of traditional semi-supervised transcription strategies that show ASR hypotheses to transcribers to help them cope with unclear or ambiguous audio and speed up transcriptions. We explain how the traditional methods introduce bias into transcriptions that make it diffi...




Publication date: 2011